Care to share? Experimental evidence on code sharing behavior in the social sciences

Transparency and peer control are cornerstones of good scientific practice and entail the replication and reproduction of findings. The feasibility of replications, however, hinges on the premise that original researchers make their data and research code publicly available. This applies in particular to large-N observational studies, where analysis code is complex and may involve several ambiguous analytical decisions. To investigate which specific factors influence researchers’ code sharing behavior upon request, we emailed code requests to 1,206 authors who published research articles based on data from the European Social Survey between 2015 and 2020. In this preregistered multifactorial field experiment, we randomly varied three aspects of our code request’s wording in a 2×4×2 factorial design: the overall framing of our request (enhancement of social science research vs. response to the replication crisis), the appeal explaining why researchers should share their code (FAIR principles, academic altruism, prospect of citation, or no information), and the perceived effort associated with code sharing (no code cleaning required vs. no information). Overall, 37.5% of successfully contacted authors supplied their analysis code. Of our experimental treatments, only framing affected researchers’ code sharing behavior, though in the direction opposite to what we expected: Scientists who received the negative wording alluding to the replication crisis were more likely to share their research code. Taken together, our results highlight that the availability of research code will hardly be enhanced by small-scale individual interventions but instead requires large-scale institutional norms.
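For illustration, the 2×4×2 factorial allocation could be sketched as follows. The condition labels mirror the design described above, but the variable names, the seed, and the shuffle-based balanced assignment are hypothetical and do not represent the study's actual randomization procedure.

# Illustrative sketch of the 2 x 4 x 2 factorial assignment (assumed
# procedure, not the study's actual randomization code).
import itertools
import random

framing = ["positive (enhancing research)", "negative (replication crisis)"]
appeal  = ["FAIR principles", "academic altruism", "citation prospect", "no appeal"]
effort  = ["no code cleaning required", "no effort information"]

conditions = list(itertools.product(framing, appeal, effort))  # 16 cells

random.seed(2020)  # seed only for reproducibility of this sketch
authors = [f"author_{i:04d}" for i in range(1, 1207)]  # 1,206 contacted authors

# Roughly balanced random allocation of authors to the 16 request wordings.
cells = (conditions * (len(authors) // len(conditions) + 1))[:len(authors)]
random.shuffle(cells)
assignment = dict(zip(authors, cells))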


Tables
We deleted an empty row (n=1) in the ESS database, which had gone unnoticed at the time of preregistration. As prespecified, we treated researchers whom we ultimately could not reach as sample-neutral failures (n=95). We also excluded researchers who appeared in the ESS database without having used ESS data substantially (i.e. overcoverage, n=83).
a We consider positive framing as treatment and negative framing as control.
b For all appeal treatments, we consider the neutral baseline request as control.
c We consider the neutral baseline request without any information on code cleaning requirements as control.

To further enhance the quality, relevance, and success of social science research, our project aims to assess the reproducibility of randomly selected articles from the European Social Survey's bibliographic database. Would you mind sharing your code with us to make sure your article can be included in our analysis? In case we have overlooked available replication files for your article online, or if you are not the right person to contact, please point us in the right direction.
Please note:
• By providing access to your code, you honor the FAIR Guiding Principles (Wilkinson et al. 2016) and showcase your commitment to good scientific practice. The FAIR principles aim to make research more transparent and sustainable, and have been adopted by research institutions worldwide.
• Ideally, you would provide access to the entire code, starting from the publicly available ESS files. This includes everything from initial data preparation to final results. Do not worry about any further preparation or code cleaning; this is not required.
• Should your analysis feature data other than the ESS, please provide access to these data along with your research code, if possible. This will greatly simplify any replication efforts.
• Our primary focus is on aggregated replication rates of published results; we will therefore not disclose your individual code without your permission. Should our results substantially differ from yours, we would appreciate the opportunity to get back to you.
Your cooperation is greatly appreciated and will substantially benefit our research.
We have compiled some background information below. Please do not hesitate to contact us with any questions or concerns.
Kind regards on behalf of the entire team,

While there is currently much debate surrounding replications, it tends to go unnoticed that studies may not be replicable for a number of reasons. This includes honest errors, different sample or model specifications, different operationalizations, as well as software changes outside of the individual researcher's control. Often, reproductions fail from the outset due to missing data and code, although journals now frequently require researchers to share this material at publication. This project focuses on articles using data from the European Social Survey (ESS). As the ESS constitutes a major public resource that you have benefitted from in the past, we hope you will be sympathetic to our request. We are convinced that this project will provide valuable insights on the replicability of social science research and will benefit both the research community and the general public.

Text A. Statistical power considerations.
Given our sample size and the observed baseline sharing rate of about 40%, the experiment's statistical power would be rather poor if one assumes a very small effect, i.e. our intervention leading to a 5-percentage-point increase in code sharing rates (Cohen's d = 0.1). In that case, the statistical power for the effort and framing treatments would be 49.1% (N = 1,028) and only 30.9% for the appeal treatment (N = 514). Assuming a small effect, i.e. our intervention increasing code sharing rates by 10 percentage points (d = 0.2), statistical power is already quite high (94.3% for the binary treatments, 73.8% for the four-level treatment). Finally, for upper-medium (d = 0.4) and large effect sizes (d = 0.8), the statistical power is excellent for all treatments (>99%). Admittedly, these power estimates rely on the assumption of no interaction effects between treatments, i.e. that we tested our main effects in different (independent) samples. If this condition is not met, statistical power would be lower than estimated. However, neither theoretical considerations nor results from the interacted linear probability model reported in Tab. D point towards interactions playing any meaningful role in our study. Taken together, our field experiment's statistical power seems more than adequate (i.e. >80%, [20]) to detect any main effect that would improve code sharing behavior substantially.
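Calculations of this kind can be approximated with standard power routines. Below is a minimal sketch, assuming a one-sided two-sample comparison at alpha = .05 with equal allocation per arm; the statsmodels-based implementation and the rounded per-arm sample sizes are illustrative assumptions rather than the original computation, but they reproduce the reported figures roughly.

# Approximate power calculation for the treatment comparisons (sketch).
from statsmodels.stats.power import NormalIndPower

analysis = NormalIndPower()

# Assumed per-arm sample sizes under equal allocation:
# binary treatments (framing, effort): N = 1,028 -> ~514 per arm;
# each appeal treatment vs. neutral baseline: N = 514 -> ~257 per arm.
scenarios = {"binary treatments": 514, "appeal treatment": 257}

for label, n_per_arm in scenarios.items():
    for d in (0.1, 0.2, 0.4, 0.8):
        power = analysis.power(effect_size=d, nobs1=n_per_arm,
                               alpha=0.05, ratio=1.0, alternative="larger")
        print(f"{label}: d = {d} -> power = {power:.1%}")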